Documents are available from the Educational Policy Research Center at
Syracuse in three formats, besides the regular publication, Notes on 
the Future of Education: 



RESEARCH REPORTS 

Reports which have completed review by the EPRC and which deal with 
specific, policy oriented research. The reports in this series are 
usually marked by intensive research, either quantified or histor- 
ical, and address themselves to specific research questions. 



EXPLORATORY REPORTS 

Reports which, while dealing with policy issues, often approach the 
realm of conjecture; they address themselves to social issues and 
the future, may be prescriptive rather than descriptive in tone, 
and are, by nature, more controversial in their conclusions. The 
review of these reports by the EPRC is as rigorous as that for 
Research Reports, though the conclusions remain those of the re- 
searcher rather than necessarily representing consensus agreement 
among the entire Center staff. 



WORKING DRAFTS 

Working Drafts are papers in progress, and are occasionally made 
available, in limited supply, to portions of the public to allow 
critical feedback and review. They have gone through little or 
no organized review at the Center, and their substance could re- 
flect either of the above two categories of reports. 



The research for this paper was conducted pursuant to Contract No.
OEC-1-7-070996-4253 with the Office of Education, U.S. Department of
Health, Education, and Welfare. Contractors undertaking such projects 
under Government sponsorship are encouraged to express freely their 
professional judgment in the conduct of the project. Points of view or 
opinions stated do not, therefore, necessarily represent official Office 
of Education position or policy. 
PREFACE



Delphi, like the future it was intended to foretell, has not turned 
out to be what we expected. It displays certain fundamental weaknesses 
in its present form as a forecasting tool. Briefly, they have to do 
with interpreting the significance of convergence of opinion under con- 
ditions imposed by Delphi. The observation that people tend to shift 
their estimates toward a group norm under conditions of iteration is, 
on the basis of several controlled experiments with Delphi, a consistent 
and solid observation. There is some very meager evidence which suggests 
that compression of estimates over rounds produces a final consensus 
closer to the "true" answer (when the consensus is taken as a median of 
the spread of estimates). This finding, however, is based upon evidence
collected from very short-term predictions in the economic domain, and
from experiments with almanac-type questions. Just how accurately the
findings can be generalized to Delphis which cover a 30-year extension
into the future is unknown. Moreover, to such a general [text illegible]
irrelevant to an understanding of plausibility in forecasting. Yet,
interpreting the social-psychological significance of the convergence
that does occur is important in understanding how the mind processes
information about the future. Once we can understand more clearly how
the mind formulates images of the future, we will be in a better position
to improve upon the process of constructing rational and plausible
forecasts.



At present Delphi forecasts come up short because there is little
emphasis on the grounds or arguments which might convince policy-makers
of the forecasts’ reasonableness. There are insufficient procedures to
distinguish hope from likelihood. Delphi at present can render no
rigorous distinction between reasonable judgment and mere guessing; nor
does it clearly distinguish priority and value statements from rational
arguments, nor feelings of confidence and desirability from statements
of probability.

Of equally great importance, however, our research also leads us 
to conclude that Delphi, in combination with other tools, is a very potent 
device for teaching people to think about the future in much more complex 
ways than they ordinarily would. When we understand this use of Delphi 
we may find that, as a general teaching strategy, it is useful and more 
important than as a forecasting device. What this means is that initially 
the way we want to get educators (in our case) to make better decisions — 
decisions which account for alternative consequences — is to help them 
think in more complex ways about the future. Delphi seems ideally suited 
to such a purpose. Indeed, educators may find in Delphi and other fore- 
casting tools a better pedagogy. One should not assume, however, that 
the weaknesses inherent in Delphi as a forecasting tool are its redeeming 
features as a teaching tool. Those weaknesses must be corrected if Delphi 
is to have any use at all, heuristic or otherwise.

Although Delphi was originally intended as a tool for scientific
and technological forecasting, its more promising educational applications
seem to be in the following areas: (a) a method for studying the process
of thinking about the future, (b) a pedagogical tool
which forces people to think about the future in a more complex way than
they ordinarily would, and (c) a planning tool which may aid in probing
priorities held by members and constituencies of an organization.
DELPHI: A CRITICAL REVIEW



I. 



WHAT DELPHI IS 

The presumption that one mark of the creative man is an ingenious 
ability to play variations on a theme has never been more pronounced 
than in Delphi studies. Hundreds of interludes have followed the Rand 
Corporation’s original composition. Although one never conducts a
Delphi (one always conducts a "modified" Delphi), certain basic concepts
have been preserved. In review, these follow.



The Exploratory Delphi 

The Delphi technique is a questionnaire method for organizing and 
shaping opinion through feedback. Its original use was to question 
experts as to their views about a chronology of scientific and techno- 
logical events, and particularly to collect their judgments as to just 
when the events might occur.[1] Delphi has been justified primarily on
the grounds that it prevents professional status and high position from
forcing judgments in certain directions, as frequently occurs when panels
of experts meet. The intention was to assure that changes in estimates
reflected rational judgment, not the influence of certain opinion leaders.
We will return to this point later.

Typically, the procedure includes a questionnaire mailed to respondents 
who remain anonymous to one another. Respondents first generate several
rather concise statements of events (or in some cases start out with the
events already stated for them), and in the second round, give estimates
of the probability of each event occurring at a given date in the future.
Once the respondents have given their answers, the responses are collated 
and returned to each respondent who then is invited to revise his estimates. 
The third round responses are made with the knowledge of how others felt 
regarding the occurrence of each event. Again, the responses are assembled 
and reported back to the participants. If a respondent’s estimate does 
not fall within the interquartile range of all conjectures, he is asked 
to justify his position, whether or not he wishes to change his position. 
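
The feedback step just described can be sketched in a few lines of Python. This is an illustration only, not a procedure taken from any of the studies reviewed here: the function name `round_feedback`, the sample estimates, and the choice of the "inclusive" quartile method are all assumptions made for the example.

```python
from statistics import quantiles


def round_feedback(estimates):
    """Summarize one round of estimates (hypothetical helper).

    Returns the quartiles reported back to the panel and the positions of
    respondents whose estimate falls outside the interquartile range and
    who would therefore be asked to justify their position.
    """
    # Quartile conventions differ; "inclusive" is an assumption of this sketch.
    q1, median, q3 = quantiles(estimates, n=4, method="inclusive")
    outside = [i for i, x in enumerate(estimates) if x < q1 or x > q3]
    return {"q1": q1, "median": median, "q3": q3, "asked_to_justify": outside}


# Invented estimates of the year by which some event will have occurred.
estimates = [1980, 1985, 1985, 1990, 1995, 2000, 2020]
feedback = round_feedback(estimates)
print(feedback)
```

Note that the sketch records only who would be asked for a justification. It says nothing about whether a given estimate rests on judgment or on guessing, which is the question this review raises.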

More recently, the technique has been extended to include questions
about how familiar the participants are with the events. Respondents are
also occasionally asked to rate the desirability of the events, should they
occur. In addition, respondents are asked to give some statements about
what impacts the events might have, if they occur. Still another question
now being asked is what possible "interventions" might be developed to
either enhance or reduce the probability that an event would occur.[2]

A number of variations have been played on this theme, but essentially 
they all end up asking a panel (sometimes referred to as experts, sometimes 
not) to assign dates or probabilities or both to rather specifically stated 
future events. In one way or another, the dates and probabilities of other 
members are revealed. The form of that revelation is usually such that a 
majority opinion is conveyed — taking for example, the median and inter- 
quartiles, or the average of the group, or the mode of the distribution 
of responses, as the majority opinion. 

Regardless of the form and means used to establish the opinion feed- 
back, the purpose of Delphi is to engage people in conjecturing about the 
likelihood of an event occurring at a particular time in the future. It 
is deliberately intended in these studies that the nature of that conjecture 
be shaped and changed by the feedback of opinions of others until a point
of relative stability is reached. 
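
What counts as "relative stability" is not defined in the studies reviewed. One possible reading, offered purely as an assumption, is that the spread of estimates stops changing appreciably from one round to the next. The names `iqr` and `has_stabilized`, the 10 percent tolerance, and the sample rounds below are hypothetical.

```python
from statistics import quantiles


def iqr(estimates):
    """Interquartile range of one round of estimates."""
    q1, _, q3 = quantiles(estimates, n=4, method="inclusive")
    return q3 - q1


def has_stabilized(previous_round, current_round, tolerance=0.1):
    """Hypothetical stopping rule: the spread changed by less than
    `tolerance`, expressed as a fraction of the previous spread."""
    before, after = iqr(previous_round), iqr(current_round)
    if before == 0:
        return after == 0
    return abs(before - after) / before < tolerance


# Invented estimates from three successive rounds.
round_1 = [1980, 1985, 1990, 2000, 2020]
round_2 = [1985, 1988, 1990, 1995, 2000]
round_3 = [1985, 1988, 1990, 1995, 1998]

print(has_stabilized(round_1, round_2))  # spread still shrinking
print(has_stabilized(round_2, round_3))  # spread unchanged
```

A rule of this kind detects only that opinion has stopped moving. It offers no evidence that the point at which it settled is a plausible forecast.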



Normative Forecasting 



The basic idea in the exploratory Delphi, the deliberate shaping of
judgments through informative feedback, has been uprooted and transplanted
in experiments with goal formulation.[3] This use of Delphi is clearly
normative. For the most part, these transplants from the original method 
differ as follows. Rather than speculating about what is probable within 
a given time frame in the future, the normative Delphi focuses on estab- 
lishing what is desirable in the form of goals and priorities. The idea 
of information feedback remains intact. However, the content of that 
feedback differs. Rather than revealing the dates and probabilities
others assign to future event statements, respondents in the normative 
Delphi learn the priorities which others assign to goal statements. For 
example, respondents might be asked to rank the following goal on a scale
of highest to lowest priority: "acceptance of teacher trainees without
prior educational prerequisites." The information revealed to the
panelists in this case would take the form of the average rank of the group.
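
As a rough illustration of this kind of feedback, the average rank per goal can be computed as follows. The goal labels, the rankings, and the function name `average_ranks` are invented for the sketch; the studies reviewed do not publish their computations.

```python
from statistics import mean


def average_ranks(rankings):
    """Average the rank each panelist gave to each goal (1 = highest priority).

    `rankings` is a list of dicts, one per panelist, mapping goal -> rank.
    Assumes every panelist ranked the same set of goals.
    """
    goals = rankings[0].keys()
    return {goal: mean(r[goal] for r in rankings) for goal in goals}


# Three hypothetical panelists ranking three hypothetical goals.
rankings = [
    {"goal A": 1, "goal B": 2, "goal C": 3},
    {"goal A": 2, "goal B": 1, "goal C": 3},
    {"goal A": 1, "goal B": 3, "goal C": 2},
]
group_feedback = average_ranks(rankings)
print(group_feedback)
```

The average conveys where the group stands on a goal but none of the reasons behind any one panelist's rank.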

Thus, in principle, the normative Delphi differs from the exploratory 
Delphi in two ways. First, the substance has to do with what one thinks 
is desirable, rather than what one thinks is probable. Second, the
normative Delphi may be thought of as not strictly temporal. Whereas the
exploratory Delphi is always concerned with rather specific future dates, 
the normative Delphi is not. That is, the panelists usually are not asked 
to assign a specific date of occurrence to goals, although in some studies 
rather general time frames are implied such as "over the next decade and 
one-half." The main function of Delphi, opinion shaping through feedback, 
is common to both forms. 
The normative Delphi clearly serves a different purpose in policy
planning. Its use has been to assess the positions constituents and mem- 
bers of an institution (school, school districts, university, etc.) are 
likely to take on certain goals. It does not necessarily follow that the 
goals developed in this fashion have any intrinsic worth. Simply because 
there is agreement on a goal does not assure there is wisdom in its pur- 
suit. (The best example I can think of to illustrate this point is the 
vote of the United States Congress on the Gulf of Tonkin Resolution.)
A committee vote no more assures that an objective is right than a
committee vote ensures that the future will be what we expect it to be. In this
sense consensus is neither a necessary nor sufficient condition for 
establishing the wisdom of an objective, nor is it a sufficient or neces- 
sary condition for establishing the plausibility of a forecast. 

Furthermore, deliberately shaping consensus on goals through feed- 
back will have little payoff for policy planning unless certain underlying 
assumptions are bared in the process. This use of Delphi, like that of
forcing consensus about future events, can be argued to be trivial on the 
same grounds. As presently construed, neither gives much attention to
underlying assumptions. In the case of goals, no underlying rationale or
motivations are aired to explain why a goal should be accepted as
important.[4]

Ranks assigned to goals, based as they likely are on different and
sometimes quite naive and even conflicting rationales, are in themselves
of little value. It is not enough to simply say a goal is important.
One must attempt to give the most powerful justification possible to such
an assertion. Otherwise reasonable men have no rationale for rendering
a decision, and the intention of such policy instruments, of course, is
to aid in the making of decisions.
Family of Forecasting Tools

Used in either of its modes, the Delphi Technique might be characterized
as a member of a family of forecasting tools. Let me identify that
family. Although Delphi is the focal point of this study, it is one of
several "intuitive" forecasting methods. Other methods include futures
history analysis, scenario writing, and cross-impact matrices.[5] What
distinguishes this family of tools from others? The answer to that question
has to do with the explicitness of assumptions in the forecast.

When people mentally construct an image of the future, it is pre- 
sumed they do so with some model in mind — a particular picture of how 
things are in the world. The model may be biased or unbiased, valid or 
invalid; it may be simple or complex. Yet, on the other hand one may not 
presume the presence of a model at all. We might presume just the op- 
posite. That is, what we observe in forecasting is not the influence of 
a model but simply a random process, mere guessing, or intuition of the 
vaguest sort so as to be nothing more than mere speculation. 

How can we decide whether a forecast is the product of a random 
process of guessing, or is the result of a particular view of the way 
things work? If the models and assumptions which support the forecast 
are not made explicit, then that question cannot be answered. 

Some forecasting methods entail the explication of a model while
others do not. In some methods the underlying assumptions, sources of
bias and error, degree of reliability, and the validity of inputs are
simply unstated. That family of forecasting methods various writers
have called "intuitive." Other methods, in which the models, assumptions,
and biases are stated, we will refer to as "empirical." Delphi and the
other methods mentioned above are all examples of intuitive methods.
Examples of empirical methods are trend extrapolation and econometric
modeling. The intuitive methods produce results which are entirely subject
to unknown bias, but this is not the same as saying their results
are incorrect; the point is that there is no way to assess "correctness."

The intuitive tools share some other common properties. They employ
collective opinion or subjective judgment as basic inputs to the forecasting
process in lieu of quantifiable data. In effect, they operate on the
principle that several heads are better than one in making subjective
conjectures about the future. It is assumed that experts, within a controlled
intuitive process, will make conjectures based upon rational judgment
and shared information, rather than merely guessing, and will separate
their own hopes and personal motivation from considered judgment in the
process. That is, it is assumed that experts are experts because they are
objective, take into account new or discrepant information, and construct
logically sound deductions about the future based upon a thorough and
disciplined understanding of particular phenomena and how they relate.
Simply put, the methods are non-data-based and rely on collective expert
judgment. (I will return to certain of these assumptions later as empirical
matters.)

Furthermore, the forecasts do not begin, as do extrapolations, with 
a demonstration of how future events grow out of specific present or past 
conditions. That is, these forecasts are not so much projections as they 
are quantum leaps into some future time frame in which one is left to find 
his way backwards to the present.

In summary, intuitive forecasting 

(i) employs collective opinion or intuition as basic inputs 
to the forecasting process; 

(ii) does not begin, as do extrapolations, with a demonstration 
of how future events grow out of specific present or past 
conditions;

(iii) does not necessarily reveal the models upon which its
authors base their opinions nor their sources of inputs
to the opinion formulating process;

(iv) thus reveals little in the way of an understanding about
sources of bias, underlying assumptions, and the nature
and the validity of inputs.



Explanatory Power 

It follows from the above that if one accepts a Delphi forecast as
plausible, he does so on the basis of blind faith. The plausibility of
a Delphi forecast, as now construed, can be determined only on the basis of
the extent of panelist agreement. But agreement alone is not a sufficient
condition for arguing that a forecast is plausible and convincing (it is
not even a necessary condition).

The nature of the Delphi method ought to be such that certain rather 
important distinctions could be made about forecasts and their underlying 
assumptions. For instance, we often fail to distinguish what is desirable 
from what seems plausible about the future. When we talk about something 
being desirable in the future, we use such words as "hope" or "goal." 

When we speak of plausibility, we use such words as "expect," "probability," 
or "likelihood." There is a fundamental distinction to be made, although 
often it is not. The purpose for making such a distinction is to separate 
forecasts of what seems likely — given certain factors — from what we would 
like to see happen, or like to avoid happening. It is not clear now how 
one can discriminate between statements in Delphi forecasts that are the 
products of hope as opposed to those which are products of rational
probability estimates. It is clear, however, that hope and desirability
interfere with and to a considerable extent influence judgments about
future events.
A second fundamental distinction needs to be made. In the absence of
actually knowing in detail just what the future will be, one can either 
guess or judge. The very basis of the Delphi forecasting process is 
opinion as to when an event is likely to occur. It seems important, in 
establishing the plausibility of such opinion, that it be supported by 
rational judgment rather than merely guess-work. Delphi, at present, can 
render no such distinction because the arguments which support an opinion
are not emphasized unless the opinion is contrary to the group norm.

The failure to clarify and share assumptions is a fundamental failure
of Delphi forecasts. Studying the future is in effect [illegible]
assumptions we hold about the future. Stripping bare the underlying assumptions
about the future often reveals that we present no alternatives, our thoughts
are based upon very naive and weak arguments, and our judgments are the
product of linear thinking. It is therefore crucial that these tools
heavily emphasize the explanations upon which the forecast rests. An
intuitive forecast which carries with it no explanatory quality may be
correct, but it would be trivial. That is the singular weakness of
Delphi: in their present form, its forecasts have little substantive
explanatory quality.

In order for a forecast to convince men of reason to take some action, 
on the basis of an argument form presented through a forecast, the
forecast must entail a plausible explanation of what is expected — both 
why one should be convinced to act, and why, if one failed to act, the 
consequences foreseen are the most reasonable consequences to expect. 
II.



APPLICATIONS OF DELPHI IN EDUCATION 



Normative Studies 



Delphi has been tried in educational planning on the assumption that 
almost anyone can forecast the future. As a result, many studies are 
without the benefit of some considerable understanding of the processes 
of futures thinking as well as policy thinking, and without the benefit 
of exact tools that facilitate a more complex consideration of the future 
than existed twenty years ago. The result so far has been to force atten- 
tion to the rather naive ability of educators to construct images of the 
future. 

So many studies have emerged that reviews have begun to appear, and
now with this report, reviews of reviews. Judd’s summary,[8] for example,
focuses on the application of Delphi and modified Delphi procedures in 
university planning. He cites in particular the work of Marvin Adelson, 
Olaf Helmer, Frederick Bolman, James Jacobson, Arnold Riesman, Samuel 
Cochran, E. S. Quade, Frederick Cyphert and Walter Gant, and Donald P. 
Anderson among others. In their work in university planning, Delphi has 
been applied in one way or another to cost-effectiveness, curriculum and 
campus planning, university-wide and state-wide educational goals and 
objectives, and evaluation. 

Like the vast majority of offerings on Delphi, Judd’s paper tends 
to be uncritical. He chooses instead to promote the application of Delphi, 
rather than to dig into its epistemology. Any consideration of the 
literature on human information processing and future time perspective,
both crucial to an understanding of the Delphi process, is lacking. 

One of the earliest uses of Delphi in educational planning was Helmer’s
Delphi,[9] which was incorporated as part of the 1965 Kettering Project to
elicit preference judgments from a panel of experts in education and 
various fields related to education. The purpose was to compile a list 
of preferred goals for possible federal funding. Just what value this 
study had is left in doubt by the experimenters. Helmer concludes, "Although
we believe that the compilation of a large number of ideas for
possible educational innovations has served a useful purpose, not too much
weight should be given to substantive findings resulting from these pilot
studies."[10]

Additional Delphi studies are reported as experiments to elicit preference
statements from educators, or those with a direct interest in education.
Most of these studies are considerably more focused than Helmer’s.
Cyphert and Gant[11] used Delphi as an opinion questionnaire to elicit
preferences from the faculty of the School of Education at the University of
Virginia and its clients, regulators, and constituency. Anderson[12] used
Delphi in a similar way in Ohio but limited the focus to a county school
district. In the Anderson study, statements were obtained from teachers,
board members, administrators, and selected educational experts. The statements
clustered in two sets: client services and organizational adaptation.
Using three Delphi questionnaires, priorities were assigned to the compiled
set of goal statements independently using "zero sum" logic.

In both the Virginia study and the Ohio study, most of the change in 
the priorities occurred after the first modal distribution was reported 
back to all respondents. Subsequent rounds failed to produce significant 
changes. The greatest disagreement on particular items in the Virginia 
study was on preparation of teachers at the graduate level without prior 
experience, and promoting uniformity of curriculum state-wide. The former 
item, preparation of teachers without experience, was ranked among the
top ten priorities by the groups as a whole, but lowest by organization
leaders and politicians. The latter item, a uniform curriculum, was ranked
high by the non-teacher organizations, and low by the university and expert
groups.

These education studies differ in principle from the original use of 
Delphi. In the three studies, respondents were asked to focus on what they 
would like to see happen, rather than what they considered likely to happen. 
However, it is unclear as to how that would change the outcome of either 
type of experiment. It is not possible at the moment precisely to separate 
Delphi statements which reflect rational judgment from those which are 
based solely on feelings of desirability. When the task is speculating 
on the future, just what assumptions underlie one’s responses are unclear — 
unless, of course, those assumptions are specifically and systematically 
flushed out. 

Two other studies which focus on the goals held by various college 
populations and where Delphi procedures were used to examine values and 
goals are discussed below. 

In a study at Educational Testing Service, Norman Uhl[13] investigated
goal preferences of off- and on-campus groups. Through questionnaires the 
two groups were asked to judge the actual importance goals seemed to have 
at their respective institutions, as opposed to how important those goals 
should be at their institutions. The on-campus groups, all from Southeastern
universities, included students, faculty, and academic administrators.
The off-campus group included trustees, parents of students, and community
leaders (politicians, representatives from the business and religious
communities, members of minority groups, and newspaper editors). The
polarization on goal preferences which did indeed emerge between these 
groups after one session in the study disappeared with subsequent feedback. 
Among the substantive findings was the none-too-surprising discovery
that not one of the groups rated religious orientation high in actual or 
preferred importance. All groups rated intellectual development high in 
preference but somewhat lower in implementation. Self-study and planning 
were also rated high in preference but lower in implementation. The lowest 
rated preferred goals, other than religion, were national and international 
service. Perhaps the most surprising finding was the substantially high 
agreement among groups, particularly when they were often thought to be 
natural adversaries. Finally, as was the case in both the Cyphert and 
Gant study and the Anderson study, all of the statistically significant 
convergence occurred between rounds one and two, after opinion feedback,
but before each respondent's defense of his position was fed into the 
process of judging. But a different effect was seen regarding preferred 
goals. Significant convergence continued to occur after defense positions 
were presented. 

Dalkey and Rourke[14] investigated the use of Delphi in processing
personal judgments about "quality of life" (QOL) as perceived by college
students. The factors were generated by the students and refined by a 
clustering process in which students sorted the factors and then, in effect, 
pooled their feelings through a process of information feedback. 

Among the findings were shifts in cognitive factors, which at first 
ranked high, but were later moved to lower ranks when weighted according 
to the relevancy they were viewed as holding for quality of life factors. 
Although the quality of life factors highest in importance were affective,
i.e., "love, caring, affection," the education factors highest in importance 
were cognitive, i.e., "ability to learn, learning to learn, reasoning 
ability, ability to think, critical ability." The education factor seen 
as most relevant to "love" was the "ability to learn" factor. As an over- 
all educational factor, cognitive skills, when compared to other factors in 
terms of relevancy, dropped from first to seventh. Self-confidence as an 
educational factor moved from eighth to first when relevancy weightings
were assigned. 



Exploratory Studies 

The exploratory Delphi technique has been used in essentially its
"pure" form in producing forecasts about the future of education. By
"pure" I mean the deliberate use of information feedback to shape the
opinion of anonymous judgers about the occurrence of particular future
events.[15] As a pilot experiment at the San Diego meeting of the National
Conference of Professors of Educational Administration, a Delphi was conducted
by staff from the Institute for the Future, Middletown, Connecticut,
and the Educational Policy Research Center, Syracuse University Research
Corporation. The major purpose was to collect conjectures about prospective
developments which might have an impact on educational administration,
their probable dates of occurrence, the desirability of such developments,
should they occur, and their potential interventions. The study has never
been formally reported.

In Canada several studies adorn the growing body of educational
Delphi studies. For instance, Berghofer’s[16] study was concerned with
general education in post-secondary institutions. Clarke and Coutts[17]
examined the conjectures of teacher educators.

Berghofer’s findings are extensive and beyond the scope of this 
paper. However, in brief, his study is important for two reasons. First, 
it is systematic. For example, for the most part, panelists for Delphi 
studies are selected arbitrarily and somewhat haphazardly. Berghofer’s 
procedure was quite thorough. Second, he modified the feedback procedure 
in a very significant way. The feedback each panelist received from 
round to round consisted of the arguments and rationales the other experts
developed in defense of their opinion; dates and probabilities were not
fed back to panelists.

Berghofer found that statistically significant differences existed 
between the final predictions of young and old panelists. Differences of 
a statistically significant level were also attributed to level of self- 
appraised competency, level of educational attainment, and organizational 
position held. In general, the panelists who held highest degrees and 
also, as it turned out, held educational posts, tended to take the most 
absolute positions — checking "never" and "perpetual" most frequently of 
the groups. 

Overall, the dates selected by experts for each event after feedback 
of arguments and rationales tended to shift toward the future. Self-rated 
appraisal of competency tended to be reduced. Unfortunately, these changes 
were not statistically treated in the study. 

Substantively speaking, the greatest agreement among the panelists 
clustered in ten problem statements (80% agreement on the year by which
the experts thought a majority of people affected by the problems would
be clearly aware of them). Berghofer reports, "A synthesis of this opinion
would indicate that the respondents looked forward to a society in which 
equality of opportunity is emphasized; quality of life is placed above 
quantity in life; leisure is used creatively; communication skills are 
stressed; concern is shown for major human problems, and a philosophic 
basis is sought for social, cultural, economic and medical changes." 

Clarke and Coutts found teacher educators generally agreed that teacher 
candidates would soon have to be skilled in the use of technology, that 
English usage would be an important criterion for evaluating teacher can- 
didates, and that knowledge and skill in process teaching methods, rather than 
product methods, would be essential. They also agreed almost unanimously 
that teaching skills would be required in individualization and group process
as well as in team teaching. The least agreement found in the study centered
on rather ambiguous statements about "change," and rather more specific
statements about control of teacher education and certification of teachers. 



There are other studies reported to date using Delphi in essentially
its original form. Hudspeth conducted a Delphi study of perceived voca-
tional education needs in New York State. The population of experts was
selected from components of the vocational education system identified by
the author as "in," "out," "through" and "external." Particular attention
was focused on "electro-mechanical technology and education." Dates and
probabilities for each event were subdivided into responses of the four
subgroups. All groups received the feedback of the four groups. Respond-
ents were asked to rate each event (presumably should it occur) in terms of
value it would have personally and value it would have to society in general.
Respondents were also asked to identify each of the four subgroups that
had "power" to enhance or inhibit the events. Respondents were finally
asked what strategies they would choose to enhance or inhibit the events.



Findings were not treated for statistical significance. The author 
reports, however, that the majority of events showed convergence but little 
shift in median date chosen. Events were generally seen as having more 
value for others than oneself. There was considerable agreement on the 
subgroups viewed as most influential in altering the occurrence of each 
event. Strategies for altering events were reported to be "poorly formu-
lated" but tended to fall into five areas: more money for R&D, tax
incentives, lobbying, union pressure, increased public awareness.

Doyle and Goodwill conducted a Delphi for Bell of Canada on the
future development and utilization of technology. The specific focus was
on information systems (computer assisted instruction [CAI], computerized
library systems, communication terminals, audio-visual retrieval systems).
The researchers posed a number of possible developments and requested the 
panelists to judge their occurrence and also to add to the list. 







The substantive findings generally posed a rosy future for educational 
technology. The experts agreed there would be extensive development and 
widespread adoption of educational technologies during the late seventies 
and eighties. Generally it was felt that cultural values would be gradually 
changing to more openness to innovation, more insistence upon involvement 
and participation, and more educational practices oriented to the
individual.

Delphi was also used to develop long-range forecasts stemming from
social indicators in a study conducted by the Institute for the Future,
and sponsored primarily by the Educational Policy Research Center at
Syracuse University Research Corporation. The areas of concern were:
urbanization; international relations; conflict in society and law enforce-
ment; national political structure; values; impact of technology on
government and society. The project was part of a larger continuing
methodological and substantive study of the future environment in which 
educational policies enacted in the near term might be expected to have 
some impact. The study was conceived, not to prepare a detailed descrip- 
tion of the future, but instead to examine expectations held by persons 
well-informed in several domains of the social sciences about the future. 

The study was intended only to be an initial step and not a final piece
of research. The substantive findings from these studies are summarized
elsewhere and are beyond the scope of this report.

In brief, a number of difficulties were encountered in the research. 
First, there was no comprehensive theoretical framework to guide the in- 
quiry. Second, and fundamentally, the social science expectations did 
not carry the crispness of language and precision of judgment that the 
more rationalized process of technological change seemed to have in the 
original uses of Delphi. For instance, just when electric power plants 
driven by thermonuclear fuel will become widespread is a development 
controlled by several "knowable" technological factors. The same cannot
be said of when alienation and impersonality of urban living will reach
its maximum. Indeed, we do not even know what it means to speak of a
"maximum" in this case. Third, the data base available to social science 
forecasting is shifting and often more unreliable than technological data. 
For example, data on the percentage of urban minorities is often not valid 
and its collection a matter of serious controversy. Fourth, even with the 
best of statistics, judgments in the social domain are subject to consider- 
able variance due to disagreement on the meaning of indicators, and thus 
forecasts are more likely biased by personal values than may be true of 
technological forecasts. 

Another societal Delphi study, using a format and design very similar
to the above, was conducted at the Westrhede Institute in Edmonton, Canada.
The purpose was to prepare a series of forecasts on social conditions which
tend to be important in educational planning. The six topics chosen by
the researchers were: changes in value and social goal orientations;
the family; leisure and recreation; intercultural relations; politics; and
problems and needs of the individual. The purpose, according to the
authors, was to be deliberately broad rather than achieve depth. No effort
was made to determine possible impacts the forecasts might have on educa-
tion (although the failure to do that makes the original intent of the
study seem rather odd).

The procedures used did not include iterative feedback. Only two 
questionnaires were used — one requesting a list of forecasts, the other 
requesting dates, probabilities and rationales. 

The substantive results tended toward irony. For instance, the 
panelists viewed the future as holding much promise for the upgrading of 
humanist values (personal liberty, social consciousness, self-respect, 
etc.), but at the same time predicted seriously widening divisions between 
young and old, English and French, red and white, rich and poor, East and 
West. The panelists expected the education system would be more responsive
to the needs of students. Yet, they also expected disaffection to in-
crease, and felt that nothing short of radical overhauling of the funda-
mental structures and processes of education would be necessary. More
specifically, the panelists felt conflict in higher education would worsen 
between student and institution. But they also expected authoritarianism 
to decrease, student participation in decision-making to increase, and 
curriculum reform to lean toward creativity, personal relationships, 
change process, leisure time. They also expected great increases in 
demand for continuing education. 

Although not clearly reported, the greatest amount of disagreement
in the report seemed to be in human relations areas: law and order,
violence, and alienation. The researchers summarize the salient findings
in the following themes: aspirations and demands for social reform will
outstrip actual reforms; society is in transition; many institutions are
experiencing obsolescence; individuality and personal freedom will be 
upgraded; and individuals will be frustrated. ("The strongest theme 
among these forecasts pertains to the frustration of the individual.") 

In short, the forecasts anticipate the best of times and the worst 
of times. 

Finally, Delphi has been modified and linked together with other
tools, not for the purpose of producing intuitive forecasts, but for the
purpose of modifying the awareness, assumptions, and skills of the persons
making the forecasts. For example, Sandow constructed a simulation
exercise which links together in a logical flow of activities the basic 
principles of Delphi, Cross-Impact Matrix, scenario writing, and analysis 
of future histories. 

There have been a number of other "first step" efforts elsewhere to 
recast forecasting tools such as Delphi into teaching tools. These efforts
are largely unreported to date. The "Ghetto 1984" game developed by
Professor Jose Villegas at Cornell University bears noting, as does
the Delphi Exploration Game developed at the University of Illinois.

In the University of Illinois project, initiated by Professor Charles
E. Osgood, Delphi was used to create a computerized gaming device called
Delphi Exploration. The general pattern of the game followed Future, a
parlor-type game developed by Olaf Helmer and Theodore Gordon. Statements
from prior Delphi research were used in the computer game. In addition,
the cross-impact matrix has been added. In Delphi Exploration the play-
ers make investments in one set of future events in an attempt to move
undesirable developments toward 0 percent probability while moving desir-
able developments toward 100 percent probability. In the Delphi II program
now under development, the player will be able to work through time from
the present to some point in the future. In Delphi I, the operating pro-
gram, the player simply tries to build what he considers to be a desirable
world in the year 2000. It is the process through which players must go
in Delphi Exploration that seems to be its objective as a teaching device.



Some Criticisms 



There are several weaknesses inherent in the Delphi methodology as 
construed in these studies. First, there is the failure to distinguish 
between assertions, which may or may not be right, and their more important 
underlying explanations and assumptions which could be judged as reasonable 
or unreasonable. Second, it is assumed that consensus and plausibility 
are somehow connected. That is, if people agree on something, it must be 
right. We have argued elsewhere that in principle consensus is neither 
a necessary nor sufficient condition for saying something is plausible. 
Furthermore, consensus clearly does not mean that rational judgment was 
exercised in the process. It has been empirically demonstrated that agree- 
ment can be achieved even when agreement clearly runs counter to logic or 
observed reality. Third, the present applications of Delphi seem to
represent "establishment futurology." The first of these weaknesses was
discussed earlier and the second will be discussed in more detail in the 
next section. Let us take a moment here to discuss the third area of 
criticism — the tendency of Delphi to become an instrument of establishment 
futurology. 

This was given some attention in "An Interim Report on the Alberta
Delphi Interaction Studies." Unfortunately, the criticisms did not survive
in the final report. However, we want to give them an airing here, together
with some embellishments of our own.

The Delphi studies reported above tend to be descriptive rather than
explanatory. They generally are surprise-free, suggesting no major dis-
continuities and implying that current trends will continue, perhaps more
sharply, perhaps not. They are, in appearance only, value-neutral; how-
ever, under the surface, they clearly present the views of an incipient
bureaucracy. For instance, there is a failure to recognize the difference
between "schooling" and "learning"; this leads to the erroneous conclusions
that learning occurs mostly or even exclusively in schools, and that when
the demand for learning increases, schooling must also expand.

There is a serious confusion in the way problems are defined. The 
confusion is carried over into the future. For example, there is a per- 
sistent failure to distinguish between what schools do to individuals 
and what schools do about individual differences. Consequently, numerous 
forecasts confuse the problems of self-expression, alienation, and indi- 
viduality among youth with institutional proposals such as IPI. 

Finally, not one of the studies reported here includes the views of 
the radical political left. Establishment futurology is entirely character-
ized by the talk of those who really are satisfied with their particular
positions and roles and status, although in that talk, certain popular
metaphors and euphemisms of change are generously allowed. Those who are
really dissatisfied, those whose ideas do not fit, will reject this mode 
of futures research, and probably would not participate in a Delphi even 
if asked — which is unlikely. 

The vast number of Delphis which have been run in various educational
institutions suggests that there is something to the argument that Delphi
has been seized as an instrument of establishment futurology. The educa-
tional Delphis are in no way startling or sensational. That is obvious
to the most casual observer. There is a serious sterility in the process
of summarizing mass information into numerous narrowly terse statements.
There is a serious absence of any effort to probe beneath the surface for
explanations. In their make-up, Delphi panels cater to the power struc-
ture, not the disenfranchised. Furthermore, the Delphi studies reviewed
here suffer from technical limitations imposed by the methodology. Topics
selected for consideration depend on the subjective judgment of the ex-
perimenter or his panel. Specific content is particularly subject to
experimenter bias because of the necessity to collate and summarize re-
sponses. Choice of alternative response forms is subjective, and generally
no provision is made for estimating the effects of greater or lesser alter-
natives. There is no provision in the studies to check on the effect of
wording, order of items or other devices that may influence the predict-
ability of events.

Delphi studies ought to be received critically, evaluated thoroughly 
and taken seriously — if they are actually believed to be an input to plan- 
ning. Now, unfortunately, that is not the case. 



III. 

EXAMINING THE RESEARCH BACKGROUND ON DELPHI 



Future Cognition: Two Traditions of Research



There are two distinct bodies of research literature relevant to
Delphi. The one literature is on the deliberate shaping of opinion through
information feedback (which is related tangentially to the vast literature
on response set and personality). The other is a literature about what I have
begun to call "future cognition," the study of human thought and the future.

A detailed review of the literature on these topics is beyond the 
scope of this report. However, below is a brief excursion through one 
salient aspect of the future cognition literature. Other studies from 
other aspects of relevant research will be discussed where appropriate. 

There have been two notable trends during the last two decades in the 
study of human thought and the future. One trend has been the investiga- 
tion of future time perspective and personality traits. This research has 
been reported primarily in the literature on abnormal psychology. Inves- 
tigations have been concerned with social relationships, time perspective, 
and personality. Studies range from theoretical and experimental reports 
to speculative papers. 

The second trend appears in the literature of forecasting, public
opinion, and information science. These studies have been concerned pri-
marily with accuracy of forecasters. Studies range from experimental to
applied research, but differ fundamentally from the psychological studies.
The central thrust has been methodological, i.e., oriented toward research
and development. The research goals have been not so much to explain human
behavior on the basis of some theoretical construct, but to improve the
forecasting techniques.

The relation of these two trends could best be described as coincidental.
They begin with a common problem, namely, the relation between human thought
and the future, but they have developed in isolation from one another.
There is little evidence of cross fertilization of data, ideas, constructs,
or theories. An exhaustive check of the documentation shows two disparate
and neatly isolated traditions of research, with the experimenters in the one
tradition not citing studies in the other. Yet, within each tradition
there is considerable continuity.

While the specific research itself on Delphi is meager, the evalua- 
tion and reporting of even what little there is remains an arena of neglect. 
Few, if any, of the serious authors of Delphi reports (and we include all 
of those noted in this report) have investigated the two traditions of 
research in any fashion resembling an evaluative approach. This neglect 
is the primary reason for detailing the studies that follow. 

There is a clear difference in the focus of the two traditions of
studies. On the one hand, experimenters in the time perspective studies
have attended to personality traits without systematically testing the
relation of such traits to the particulars of the experimental condition,
i.e., level of ambiguity, level of uncertainty, degree of complexity, level
of abstraction, the nature of feedback and types of reinforcement, all of
which, it can be argued, are present to one degree or another in each of
the experiments.

On the other hand, experimenters in the forecasting studies generally
tended to focus on procedures, but not on the interrelationship between
procedures and personality traits. The notable exceptions are the McGregor
study, and the more recent studies of Campbell, Weaver, and Waldron.
For the most part forecasting studies have focused on the variance which
could be accounted for by feedback of certain forms of information, nature
of reinforcement, task complexity and the like.



The Early Studies: Forecasting and Forecasters 

Nathan Israeli was one of the pioneers in subjecting forecasting to
empirical test. His work, however, has been criticized for containing
several methodological and conceptual defects. Israeli's work is none-
theless mentioned here because several of the procedures he used or pro-
posed for studying future cognition are now being used in Delphi forecast-
ing techniques.

In one study, for example, subjects made a series of qualitative
and quantitative judgments about future events. Subjects were asked to
respond to future events in the following ways: (a) set a date for their
occurrence, (b) select the most probable development among alternatives,
for a specific future date, (c) select the most probable outcome from
among alternatives for a given situation at a stated future time. These
three tasks have been further experimented with by McGregor, Kaplan,
and Helmer. In other studies Israeli experimented with wishes of college
students regarding "improbable" future events in an effort to explore con-
flict between wishes and reality, and student emotionality toward past,
present, and future.



Israeli developed a series of ten experimental designs in all to be
conducted over a period of time. The results of these experiments un-
fortunately were incompletely reported in the literature. However, the
research designs were important in terms of questions, assumptions and
rationales. Among the ten designs are the following which have been given
attention by experimenters in one way or another over the last three
decades. The designs also raise some very important questions not explored
to date.

One of the designs would have tested the degree to which known infor-
mation affects a prediction. This was later tested by McGregor. Israeli
was concerned about the degree to which projections would follow logarithmic
curves when subjects are or are not given past information about the
event. A second design would have tested the degree to which "dogmatic
and more liberal or imaginative" individuals would perceive remote and
near futures. Roberts and Bonier later explored this question, but
with results Bonier interpreted as contradictory to theory. This question
is important because of Israeli's assumption that extension into the future
is accompanied by increasing variability in thinking about different situ-
ations, and his assumption that dogmatism is perhaps relevant to such
increased variability.

A third design would have subjects rank the importance of certain
eminent authorities at various times in the future as they might be ranked
by those living at the time. Still another design would have asked sub-
jects to name a period of time in which catastrophes occurred by certain
areas, e.g., chemical, sociological, etc., and to name a factor contribut-
ing to the catastrophe. A fifth experimental design dealt with probability
or certainty felt by the subject that an event would occur in a given
period of the future. As a particular aspect of probability, Israeli sug-
gested using the catastrophes elicited in earlier designs for a set of
stimuli to which subjects would assign probabilities. The assignment of
probabilities to the occurrence of future events is a major source of
data for the Delphi technique and the related cross-impact matrices.

In short, the Israeli studies are significant not so much for elegance
of findings, but for the very basic questions he raised about human thought
and the future. These questions have continued to be explored for more
than thirty years.

Two important studies followed the Israeli research. McGregor dealt
with the problems of "predeterminers" of prediction, i.e., attitudes, wishes
and beliefs, together with "objective conditions that have been present 
in the immediate past in the environment of the predictor." His assumption 
was that these factors taken as a whole determine the premise underlying 
predictions. 

McGregor found that the highest probabilities of occurrence for an
event, regardless of the respondent's feelings of desirability, were regis-
tered in situations thought by the respondent to be unambiguous. McGregor
concluded that predicting occurrence of events seemed to be coerced by a
reduction in the ambiguity of the situation surrounding the event. McGregor
also found that the greater the ambiguity of the situation and the import-
ance of the event to the predictor, the closer the prediction corresponded
to attitude. Familiarity was not related to differences in 8 out of 9 pre-
dictions. Although the experts were "much better informed on the average,"
their predictions did not differ significantly from those of students when
their attitudes were roughly the same.

The McGregor experiment is particularly significant because he began 
to explore the interrelationship between dispositional factors (attitude, 
etc.) and the nature of the judgmental conditions (ambiguity, uncertainty, 
subjective probability).

A study similar to McGregor's work was conducted by Cantril. Cantril
was interested in exploring to some degree the following kinds of questions.
Can predictions of the timing of an event be made with as much certainty
as predictions regarding the actual outcome of an event? Are there differ-
ences in the accuracy with which local and geographically distant events
are predicted? Are there differences in the certainty with which immediately
likely and distant (near and remote) events are predicted? What is the
comparative accuracy of predictions of individuals versus groups? Are "men
of affairs" more certain of their judgments than academicians? What is the
effect of attitudes on predictions? What are the circumstances under
which attitudes contradict predictions?



Cantril found that all of the respondents were more certain that an 
event would occur some time than they were about the exact date of occur-
rence. He found that attitudes, as determined by an eight item survey,
tended to be related to affirmative predictions of events in the direction 
of the attitude, e.g., socialists tended to affirm such "socialist events" 
as federal control of electric power. 



In addition, Cantril found that events which appeared to lack relevant 
facts for a predictive base were judged with little certainty, and that 
academics were less certain of outcomes of events than "men of affairs," 
e.g., bankers, insurance executives, newspaper editors, etc. 



The McGregor and Cantril studies differed in the following respects: 



(i) McGregor attempted to relate personality attributes to 
accuracy of event probabilities. Cantril could not 
control for accuracy because only 2 of 15 events in 
his study actually occurred during the study. 

(ii) Cantril explored the differences in probabilities
assigned to date of occurrence as compared to actual
likelihood of ever occurring. McGregor did not explore
this question.

(iii) Cantril explored the question of whether academics and
men of affairs differed on certainty of predictions
and whether certainty was related to probability.
McGregor examined the differences between academics
and students.



It should be noted that neither the McGregor study nor the Cantril



More Recent Studies on Forecasters and Forecasting

An experiment similar to the two above was conducted by Abraham
Kaplan. Kaplan raised questions about three basic areas of prediction:
evaluation, improvement, and appraisal. With regard to evaluation, Kaplan
was concerned with these questions: How successfully can predictions of
social and technological events be made? Do social and technological
events differ with regard to stability of predictions? How precisely can
predictions be made? Are there differences in predictions of near and
distant future events?

Kaplan defined three basic research problems in the area of improve-
ment: improvement of prediction reliability by taking the mean estimates
of the probabilities of events; improvement of accuracy by weighting the
probabilities according to prior performance of experts; improvement by
collective group predictions compared to a number of predictors working
independently.
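
The first two of these improvement ideas can be illustrated with a minimal sketch. The function names, the sample probabilities, and the prior-performance weights below are hypothetical and chosen only for illustration; they are not taken from Kaplan's study, and the report does not say how Kaplan computed his weights.

```python
def mean_pool(estimates):
    """Average several predictors' probability vectors, alternative by alternative."""
    n = len(estimates)
    return [sum(column) / n for column in zip(*estimates)]


def weighted_pool(estimates, weights):
    """Weight each predictor's vector by a prior-performance score (assumed given)."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*estimates)
    ]


# Three hypothetical predictors, four mutually exclusive alternatives;
# each row sums to 100.
estimates = [
    [40, 30, 20, 10],
    [10, 60, 20, 10],
    [25, 25, 25, 25],
]
# Hypothetical prior success rates used as weights.
weights = [0.6, 0.5, 0.25]

pooled_mean = mean_pool(estimates)
pooled_weighted = weighted_pool(estimates, weights)
```

Because every row sums to 100, both pooled vectors also sum to 100, so they can be read as group probabilities over the same four alternatives.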

Regarding appraisal, the essential problem was viewed as one of
"specifying subpopulations of predictions in which the probability of suc-
cess remains relatively stable." Specifically the stability question was
related to confidence and precision of estimate. The question was whether
predictions made with high confidence are more likely to be successful than
those made with less confidence.

A twenty-week limit was set on each event to be judged in the study.
The event could be confirmed or not within twenty weeks. In
each questionnaire the respondent was given four exclusive predictions to
which he was asked to assign probabilities of occurrence from 0 to 100.
Values for the four alternatives were to sum to 100. In addition, space was
provided for an open-ended statement of "Basis for Your Judgment."

The questionnaires were distributed weekly for 13 weeks to all respond-
ents. One-half of the predictors worked together on each new set of pre-
dictions in quartets split as follows: (a) an independent group answered
individually as usual; (b) a cooperative group discussed the questions,
but answered them individually; (c) a joint group discussed the questions,
came to a collective decision and gave one answer for the entire group.
The participants rotated among the three groups.



Kaplan found that the relatively near future was more accurately pre- 
dicted than the relatively distant future. Prediction success varied in- 
versely with its scope in time. Five months was the longest interval of 
time considered. 



The entire group on all questions had a success mean of 53 percent.
That is, in 53 percent of the cases where highest values were assigned,
those cases were verified. Random success would have been 25 percent.
However, predictors who were often right were scarcely more definite than
consistently wrong predictors, definiteness being the degree of success 
above 25 percent. 
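
The scoring rule described here can be sketched as follows. This is an illustrative reading of the text, not Kaplan's actual procedure: the forecasts are invented, and the treatment of ties for the highest value is an assumption the report does not address.

```python
def is_success(probabilities, verified_index):
    """A forecast succeeds when its highest-valued alternative is the verified one."""
    if abs(sum(probabilities) - 100) > 1e-9:
        raise ValueError("values for the alternatives must sum to 100")
    top = max(probabilities)
    # Assumption: a tie for the highest value counts as no clear prediction.
    if probabilities.count(top) > 1:
        return False
    return probabilities.index(top) == verified_index


def success_rate(forecasts):
    """Percentage of forecasts whose top-valued alternative was verified."""
    hits = sum(is_success(p, v) for p, v in forecasts)
    return 100.0 * hits / len(forecasts)


# Four exclusive alternatives, so random choice succeeds 25 percent of the time.
CHANCE = 100.0 / 4

# Hypothetical data: (values over four alternatives, index of the verified one).
forecasts = [
    ([70, 10, 10, 10], 0),
    ([20, 50, 20, 10], 1),
    ([40, 30, 20, 10], 2),
    ([10, 10, 20, 60], 3),
]
rate = success_rate(forecasts)
definiteness = rate - CHANCE  # degree of success above the chance level
```

On this reading, the reported group mean of 53 percent corresponds to a definiteness of 28 percentage points above chance.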

Natural science events were no more successfully predicted than social 
events — despite the fact that most predictors were not social scientists. 
However, Kaplan found that social events were predicted with more confidence 
than science events. 



Knowledge of specific events was not related to successful predictions 
of specific events in the study. Kaplan also found that a statistical 
averaging of independently made forecasts yielded a success rate equal to 
group formulated predictions. However, joint group efforts and cooperative
group efforts (discussion and then independent predictions) were superior
to predictions by the same individuals when not participating in groups.



Finally, Kaplan found that one's justification of the basis of his
judgment was related to successful prediction; justification was defined
as a statement of logical warrant for prediction. Examples were factual
elaborations of details of question and answer, evidence of specific em-
pirical generalizations, hypotheses about motivations of predicted behavior,
analysis of time required for the event to occur. "Guesses" (suspected by
Kaplan to be systematic or "educated") as the stated basis of prediction
were successful in 40 percent of the cases, significantly better than
chance.

Kaplan’s study is quite significant in the tradition of research on 
prediction behavior. However, a discussion would not be complete without 
citing some of the disclaimers in his research. 

First, topics selected for the questionnaires depended on the sub- 
jective judgment of the experimenters with regard to such factors as likeli- 
hood of occurrence of an event within the five month interval, and the 
intrinsic difficulty (level of complexity) of each prediction. The specific 
content of the questions was subject to experimenter bias. Second, choice 
of alternatives was subjective and the study provided no basis for estimat- 
ing the effects of greater or lesser numbers of alternatives, or allowing 
the predictors to specify alternatives in an open-ended fashion. Third, 
there was no opportunity to check on effect of wording, order, or other 
devices on predictability of events. Fourth, the short time span in the 
study had the effect of tending to force selection of questions for the 
questionnaire from among potentially rapidly changing events; the unexpected 
consequence of this was to enhance predictability of certain items on the 
basis of obvious forecasts of "No Change." The first three of these 
limitations would apply in general to all Delphi studies. 



Feedback Studies 



Norman Dalkey has for several years engaged in a series of studies 
of group formulated opinion at the RAND Corporation. His interest 
parallels that of Kaplan, and also that of the earlier McGregor and Cantril 
studies. Dalkey is particularly interested in the question of improved 
(more accurate) group judgments through the use of controlled feedback. 

In Dalkey's experiments, the questions have typically been drawn from 
almanacs and therefore the answers are of a factual nature which can be 
confirmed or disconfirmed. They are in that sense atemporal. The questions 
do not in themselves demand any consideration of the future. Whether 
estimates made under such circumstances bear any relevance to Delphi fore- 
casting is a matter still untested. The judgmental tasks inherent in 
forecasting might be presumed to differ on logical grounds and may differ 
on psychological grounds as well. 

Two basic problems were investigated in the Dalkey studies: comparison 
of face-to-face discussion with controlled feedback, and improvement of group 
estimates using an iterative form of information feedback. 

In general, Dalkey found, "more often than not," that face-to-face 
discussion tended to make estimates of the group less accurate, whereas 
controlled anonymous feedback made the group estimate more accurate. 
Specifically, he found that the median response of the questionnaire 
group was more accurate in 13 cases out of 20 and the discussion group 
more accurate in 7 cases. The result was not statistically significant. 
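
The lack of statistical significance in a 13-to-7 split can be checked with 
a simple sign test. The sketch below is not Dalkey's own computation; it 
assumes that the 20 comparisons are independent and that, if neither 
procedure were better, each would be equally likely to come out more 
accurate on any one question. The function name is hypothetical. 

```python
from math import comb


def sign_test_two_sided(wins: int, trials: int) -> float:
    """Two-sided exact binomial (sign) test against a 50/50 null hypothesis."""
    # Use the larger of the two counts so the upper tail is the extreme one.
    k = max(wins, trials - wins)
    upper_tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    # Doubling the tail can exceed 1 when the split is even, so cap it.
    return min(1.0, 2 * upper_tail)


# 13 of 20 questions favored the questionnaire group, 7 the discussion group.
p_value = sign_test_two_sided(13, 20)
print(round(p_value, 3))  # 0.263
```

Under these assumptions a split at least this uneven would arise by chance 
in roughly one trial out of four, which agrees with the report that the 
result was not significant. 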

In an a posteriori experiment using smaller groups but giving them 
anonymous feedback from a Delphi questionnaire prior to their forming 
groups, Dalkey found that discussion after the first round produced more 
accurate estimates, but further discussion also produced more inaccurate 
answers. 

Dalkey tested the widespread belief that, in the controlled feedback 
process, group "agreement" or convergency means answers are more likely to 
be correct than if the group's response remains widespread. The correlation 
between standard deviation (spread) and accuracy produced, in Dalkey's 
words, a "disappointing result" (statistically significant, "but not high 
enough to be interesting"). Dalkey, in comparing estimates to a randomized 
set of answers, found that differences were "heavily masked by chance" 
on the first round. 

Dalkey also found that repeating the feedback from round to round had 
the effect of closing the spread and also improving the medians of some 
answers while reducing the accuracy of others. For about 64 percent of 
the changed estimates, the median improved in accuracy; but for 36 percent, 
the median became less accurate. He found as well that the respondents 
closest to the median on the first round were the most accurate and also 
less likely to change. After iteration the swing group became more accurate 
as the median shifted in the majority of cases toward the "true" answer. 

Finally, it is clear that the group norm is much stronger than the 
effect of the "true" answer. That is, the convergence is consistently to- 
ward the group norm (median) independently of whether the norm moves toward 
the "true" answer. This, of course, leaves unanswered two fundamental 
questions. Under what conditions will people readjust their estimates toward 
the norm, and what are the effects of personality in such a process? 
Is there even the slightest relation between convergency and "correctness" 
in estimates and, if so, how could one explain it? 

In conclusion Dalkey cited three basic findings on prediction that 
have emerged from the experiments at RAND: pronounced convergence of 
opinion occurs after feedback; the major part of convergence takes place 
between the first and second rounds; and, in cases where accuracy could be 
checked, the accuracy of group responses increases with feedback. Finally, 
Dalkey reported that considerable variance existed in performance on 
different questions in the experiments. Split-half reliability on 
questionnaires ranged from .4 to .6: in Dalkey's words, not high enough to 
"measure with." 

Brown also examined the question of accuracy in prediction. The 
study did not attempt to explain why accuracy should or should not occur, 
or why accuracy might vary in a given population. Like the Dalkey experi- 
ments Brown’s study did not deal with future events, but instead also used 
almanac-type questions. In Brown’s study twenty-three RAND researchers 
were used as subjects. Twenty questions were submitted to them. Eighteen 
of the questions varied in content but could be answered with factual 
information from the World Almanac. The remaining two were mathematical 
questions that could be computed but with some difficulty. Each respondent 
self-rated his confidence on each estimate. Questions and responses were 
submitted to respondents over four rounds. Each round requested revision 
of an answer and, if the answer were outside the interquartile range, to 
state reasons for divergency. 

Brown found that medians tended to move more closely to the correct 
answer over succeeding rounds, but the interquartile ranges converged away 
from the correct answer; that is, as the range of estimates decreased, the 
correct answer was less often included in the middle 50 percent of the 
estimates. The "ball park" answers (within 25 percent of correctness) increased 
from 21 percent to 38 percent over four rounds as calculated from the 
medians. Interquartile ranges containing the true answer decreased from 13 
out of a possible 20 to 7 out of 20 over four rounds. Brown also found that 
the subgroup of estimators who rated themselves highest in confidence had 
collectively better median success than the average. 
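
The two measures reported here, the "ball park" median and the interquartile 
range that does or does not contain the true answer, can be computed as in 
the following sketch. The estimates are invented for illustration and are 
not Brown's data; the function name and the choice of the "inclusive" 
quartile method are likewise assumptions. 

```python
from statistics import median, quantiles


def summarize_round(estimates, true_value):
    """Median, interquartile range, and two accuracy flags for one round."""
    q1, _, q3 = quantiles(estimates, n=4, method="inclusive")
    med = median(estimates)
    return {
        "median": med,
        "iqr": (q1, q3),
        # "Ball park": the median lies within 25 percent of the true answer.
        "ball_park": abs(med - true_value) <= 0.25 * abs(true_value),
        # Does the middle 50 percent of the estimates bracket the true answer?
        "iqr_contains_truth": q1 <= true_value <= q3,
    }


# Hypothetical estimates for one almanac-type question whose answer is 100.
true_value = 100
round_1 = [30, 50, 60, 70, 105, 140, 200]
round_4 = [78, 80, 83, 85, 88, 90, 94]

first = summarize_round(round_1, true_value)
last = summarize_round(round_4, true_value)
print(first["median"], first["iqr"], first["iqr_contains_truth"])
print(last["median"], last["iqr"], last["iqr_contains_truth"])
```

In this invented case the median moves from 70 to 85, into the "ball park," 
while the interquartile range narrows until it no longer brackets the true 
answer. Both effects can therefore occur together, as Brown reported. 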





Personality Influence 



Three recent studies have investigated the effects of personality 
influence on the outcomes of Delphi forecasting. Campbell compared the 
effectiveness of the Delphi questionnaire technique against group-discussed 
forecasts. The forecasts were concerned with specific economic indicators 
such as GNP. The projections were made three months in advance of confirming 
data. 

Campbell found that the Delphi group estimates decreased more in 
interquartile range than the discussion group estimates. However, the 
convergence that occurred in the Delphi process tended to exclude the "cor- 
rect" answer, and the exclusion process increased over rounds. Campbell 
also found that individual estimates in the Delphi experimental group were 
more accurate, in the sense they deviated less from the correct answer, 
than individual estimates in the discussion groups. His data also reveal 
that the Delphi sample, as a group, was not more accurate to begin with, 
but tended to improve over four rounds. Campbell found that, in general, 
as a group, Delphi forecasts were more accurate than the discussion group 
forecasts. 

Campbell also found that self-confidence (self-rated) tended to be 
related to accuracy, but he states that "selecting the most self-confident 
members of a group . . . was not an effective means of identifying the most 
accurate forecaster" (p. 112). 

Campbell found that in the discussion groups frequency of participation, 
as perceived by the group, tended to be related to the groups' perception 
of influence and competence. Even though the substance of subsequent 
forecasting tasks changed, the same people were perceived to be 
most influential and competent. The data were inconclusive in determining 
whether "influentials" were also more accurate as a subgroup of forecasters. 
In about half of the cases they were more successful, but in an equal number 
of instances they were not. The data did suggest that accuracy of the group 
tended to be a function of the accuracy of the most influential forecasters. 

Finally, and perhaps most significantly, two measures of personality 
traits were predictive of certain behaviors, both in the discussion and in 
the Delphi experimental groups. Campbell found that participants with 
"inclusion" and "affection" needs (FIRO-B scale) tended to be persuaded to 
change more frequently in the discussion groups. He also found that Delphi 
was not immune to such conformity-induced behavior. In the Delphi experimental 
groups, participants with high inclusion and affection needs accounted 
for a part of the convergence. They were significantly more conformist as 
a subgroup than others. 



EPRC Research 



Research conducted at the Educational Policy Research Center at Syra- 
cuse has been primarily an investigation of how human information processing 
(conceptual level) is related to prediction. It was assumed in our study 
that the spread in estimates made by forecasters could be predicted by their 
conceptual level and that their estimates would change in predictable ways 
under different conditions in the experiment. Our study was also concerned 
with objectivity as an influence on Delphi outcomes, and whether Delphi 
is immune to different propensities of forecasters to conform. 

With graduate students in the field of education, Weaver found that 
conceptual level is clearly related to the outcomes of Delphi forecasts. 
Deciding how far apart to place earliest and latest dates of expected oc- 
currence of future events was a task distinguished by conceptual level. 

When information cues were given to help forecasters decide when an 
event might occur, concrete persons (low conceptual level) narrowed the 
distance between their earliest and latest dates. However, when estimating 
the dates of occurrence of future events was open-ended, concrete respondents 
widened their estimates greatly. This effect is particularly significant when 
it is considered that abstract persons (high conceptual level) did not differ 
significantly from treatment to treatment. 

It was also found that the dates assigned by abstract persons cor- 
related significantly with their self-rated feeling of "desirability" 
regarding the events. However, the correlation was not significant for 
concrete persons. The complex thinker seems to be no more "objective" 
(and perhaps less so) than concrete persons in assigning dates to future 
events. This, of course, is contrary to expectations. 

Waldron’s dissertation, which primarily reports a replication of the 
basic work in the above study, dealt with the propensity of concrete persons 
to converge their estimates over rounds of the Delphi. In Weaver’s study 
described above there was no iteration of rounds. Subjects in Weaver’s 
study were randomly assigned to treatments which approximated three rounds 
of Delphi, but no repeated measures were employed. In Waldron’s study, 
high and low conceptually complex persons were subjected to three rounds 
of Delphi over time. Waldron reconfirmed that feedback of dates narrowed 
the estimation ranges of concrete subjects but did not significantly affect 
the highs. He found also that lows had a greater propensity to change 
estimates across rounds, and to change their estimates toward the norm of 
the controlled feedback. 
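
The two changes described here, a narrowing of the range between earliest 
and latest dates and a movement toward the norm, can be made concrete with 
a small sketch. It is an illustration only and not the scoring procedure 
used by Weaver or Waldron: the panelists and dates are invented, the class 
and function names are hypothetical, and the "norm" is assumed to be the 
median of the first-round midpoints. 

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Estimate:
    earliest: int  # earliest year in which the event is expected
    latest: int    # latest year in which the event is expected

    @property
    def width(self) -> int:
        return self.latest - self.earliest

    @property
    def midpoint(self) -> float:
        return (self.earliest + self.latest) / 2


def describe_shift(before, after):
    """Per panelist: change in range width and movement relative to the norm."""
    # Assumed norm: median of the first-round midpoints.
    norm = median(e.midpoint for e in before.values())
    report = {}
    for name, b in before.items():
        a = after[name]
        report[name] = {
            "width_change": a.width - b.width,
            "moved_toward_norm": abs(a.midpoint - norm) < abs(b.midpoint - norm),
        }
    return report


# Hypothetical panelists and dates.
round_1 = {"P1": Estimate(1975, 1985), "P2": Estimate(1980, 2000),
           "P3": Estimate(1990, 2020)}
round_2 = {"P1": Estimate(1978, 1986), "P2": Estimate(1980, 2000),
           "P3": Estimate(1985, 2005)}

shift = describe_shift(round_1, round_2)
for name, row in shift.items():
    print(name, row)
```

A negative width change marks a narrowed range. Tabulating both measures 
separately for high and low conceptual level subjects would show whether 
the lows account for most of the movement, as Waldron reports. 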



Notes on the Recent Research 



Although there appear to be some similarities, these studies differ 
somewhat. Campbell was interested in determining whether Delphi is superior 
to "uncontrolled discussion" in producing more accurate forecasts. Weaver’s 
and Waldron's studies were not concerned with accuracy. Therefore, while 
the three studies confronted subjects with future events, Campbell's 
events were of an entirely different nature. They could be quantified 
and covered only a short time span. Moreover, in Campbell's design subjects 
were not asked when an event would occur, but instead were asked 
to make projections, e.g., what the GNP would be three months hence. In 
both Waldron's study and in Weaver's study, subjects were asked to assign 
an earliest possible and latest possible future date to rather general 
events, e.g., "widespread use of sophisticated teaching machines." The 
events could be viewed as extending several years into the future. 

Weaver's research was concerned with the differential effects of such 
experimental conditions as fixed alternatives on persons having different 
conceptual levels. Campbell's experimental treatments were designed to 
find out whether the Delphi method would produce more accurate results on 
short-term projections than would a committee. 



All three studies did attempt to explain why convergence occurs by 
demonstrating that people with certain personality traits conform to an 
information norm while others — with different traits — do not. Campbell's 
study is significant in demonstrating that in both group discussion and 
in the Delphi process personal needs for inclusion and affection 
account for a part of the convergence. Both Weaver's data and Waldron's 
data suggest that conceptual level accounts for considerable convergence. 
It is probably reasonable to assume that the traits measured in all 
three studies are related, and are predictive of conforming behavior under 
conditions where specific knowledge of the "correct" answer is lacking, 
regardless of whether the content of the experimental question is the 
past, present, or future. That assumption, of course, needs to be tested. 



It should be noted that the experiments by Dalkey and Brown 
should be interpreted carefully in regard to the prior research of Cantril, 
Kaplan, and McGregor, and the more recent studies of Weaver, Waldron, 
and Campbell. The findings of Dalkey and Brown are not necessarily 
relevant to any temporal consideration whatever. Just exactly what judgmental 
tasks are involved in these studies is not clear; they may include 
things such as "ranking," "estimating," "computing," and so forth. It may 
well be that such judgments are in fact closely related to judgments made 
about the future. This, however, is an unresearched assumption. 

It is also important to note that, except for Waldron’s and Weaver’s, 
all of the above research is concerned with "accuracy" of judgment. Although 
this is a question certainly worthy of consideration, not enough of these 
studies raise significant questions about the effects of personality 
biases (attitudes, values, beliefs, etc.) on the outcome of Delphi fore- 
casts. The need for further research into the effects of personality has 
certainly been demonstrated and some directions are beginning to be clari- 
fied. 



What we know about how the mind constructs images of the future re- 
mains rather puny, but the fundamental assumptions which are generally 
held about Delphi seem questionable. For instance, the Delphi technique 
was created to prevent professional status and high position from forcing 
judgments in certain directions when panels of experts met. The intention 
was to assure that through questionnaires, changes in estimates would 
reflect rational judgment, and therefore not be subject to social psychological 
factors. Empirical evidence tends to show the naivete of such 
an assumption. Experimental evidence clearly demonstrates that using 
a questionnaire technique to generate information feedback does not eliminate 
the effects on conformity that one observes under group pressure. 

Those persons who tend to conform under group pressure seem to do so even 
when the norm which attracts them is the statistical averaging opinion 
from a questionnaire. Furthermore, the conformist (in both types of con- 
ditions) tends to be more submissive, more anxious, more authoritarian, 
less intelligent, less theoretical, less realistic, and more emotionally 
reactive. The conclusions from this literature generally tend to emphasize 
the role of motivational systems in explaining conformity. The differences 
in the way people seek and use information feedback clearly follow personality 
patterns, regardless of whether conformity is induced by group 
pressure or questionnaires. 



Three independently conducted studies suggest that within the Delphi 
procedure individuals who "swing" in from wide ranges to more narrow ranges 
do so less on the basis of rational argument, examination of evidence, or 
review of assumptions, than because decision-making strategies of certain 
persons are subject to change as the task is perceived to be less ambigu- 
ous, and on account of certain personality factors such as fundamental 
needs and integrative complexity. These findings, of course, are not unex- 
pected, and generally support the studies of several other investigators. 

The propensity to conform might be distinguished as follows in the literature. 
Conformity to a group norm of unanimous peers who have expressed a 
judgment which is in obvious contradiction to logic and reason ought to be 
and is associated with personal and motivational attributes: timidity, 
deference to others, central needs, needs for approval. On the other hand, 
conformity to unanimous norms in ambiguous situations which defy logical 
and reasonable argument should be associated not only with the above 
attributes, but also with informational-handling conventions such as persistence 
in seeking closure and external locus of control. Indeed in the EPRC 
experiments we found just that. 

It also seems clear that subjective judgments of even very complex 
or abstract thinkers may be considerably influenced by their feelings of 
desirability regarding the future events in question. The assumption that 
experts, who may be presumed to be complex thinkers, bring to bear "cool 
analysis" in their judgments about the future, is questionable in light 
of our findings. 



Still focusing for the moment on process, just what do we know about 
how people think about the future? From the research reviewed in earlier 
papers, I have drawn the following summary observations. 

The psychological studies of future perspective and personality traits 
strongly suggest that concepts and perceptions held about self and others 
are interrelated and reflective of thoughts about the future. Conceptual 
level, alienation, anxiety, social deviancy, emotional instability, and 
schizophrenia — all powerful indicators of particular ways of perceiving 
and relating to society — impinge upon one’s future cognition. Numerous 
studies showed that these indicators were sufficiently strong to distin- 
guish perceptions about the future, particularly when such perceptions 
involve estimating how long something would take or involve foreseeing 
some state or states of affairs. 

It follows that persons with different kinds of "self structures" 
(needs, attitudes, beliefs, etc.) would hold different perceptions about 
the present as well as the future, and thus produce different kinds of 
forecasts about the future. This statement appears to be rather evident. 
How to shape it into a researchable set of questions is not as evident 
because exogenous variables also impinge upon judgments. For instance, 
the phrasing or complexity of a question, or the influence of a group 
norm, even though it may be anonymous, influences the judgments of certain 
people. Whether or not the judgmental task is vague or uncertain, or is 
perceived to be vague or uncertain, may also influence particular people 
to a considerable degree. 

Research questions on forecasting methods must begin to reflect some 
consideration of the interaction between dispositional factors and the 
conditions in the experiment. Among the more important questions are these: 
how do differences in judgments about the future reflect differences in the 
self-structure of the people who make the judgments? And consequently, 
how will differences in estimates be shaped by exogenous variables such as 
complexity and ambiguity of the task? The failure to consider these questions 
is a persistent weakness of most Delphi studies to date. 



Some Conclusions 



Delphi, like the future it was intended to foretell, has not turned 
out to be what we expected. It displays certain fundamental weaknesses in 
its present form as a forecasting tool. Briefly, they have to do with 
interpreting the significance of convergence of opinion under the condi- 
tions imposed by Delphi. The observation that people tend to shift their 
estimates toward a group norm under conditions of iteration is, on the 
basis of several controlled experiments with Delphi, a consistent and sound 
observation. There is some very meager evidence which suggests that compression 
of estimates over rounds produces a final consensus closer to the 
"true" answer (when the consensus is taken as a median of the spread of 
estimates). This finding, however, is based upon evidence collected from 
very short-term predictions, in the economic domain, and from experiments 
with almanac-type questions. Just how accurately the findings can be 
generalized to Delphis which cover a 30-year extension into the future is 
unknown. Moreover, to make such a generalization is irrelevant to an under- 
standing of plausibility as discussed earlier. Yet interpreting the social 
psychological significance of the convergency that does occur with such 
opinion is important in understanding how the mind processes information 
about the future. Once we can understand more clearly how the mind formu- 
lates images of the future, we will be in a better position to improve upon 
the process of constructing rational and plausible forecasts. 

Any consideration of the future of education should attempt to clarify 
what we can reasonably expect to make happen or not expect to make happen. 
Rather than a focus on "accuracy," the focus might better be on "plausibil- 
ity" or reasonableness of forecasts. In that sense Delphi at present comes 
up short because there is little emphasis on the grounds or arguments which 
might convince policy makers of the forecasts' reasonableness. There are 
insufficient procedures to distinguish hope from likelihood. Delphi at 
present can render no rigorous distinction between reasonable judgment and 
mere guessing; nor does it clearly distinguish priority and value statements 
from rational arguments, nor feelings of confidence and desirability 
from statements of probability. 

Of equally great importance, however, our research also leads us to 
conclude that Delphi, in combination with other tools, is a very potent 
device for teaching people to think about the future of education in much 
more complex ways than they ordinarily would. When we understand this 
use of Delphi, we may find that it is a useful instrument for something 
more important than what it was designed for, viz., a general teaching 
strategy. What this means is that initially the way to get educators to 
make better decisions (decisions which account for alternative future 
consequences) is to enhance their capacity to think in complex ways about the 
future, and Delphi seems ideally suited to such a purpose. Indeed, educators 
may find in Delphi and other forecasting tools a better pedagogy. 



IV. 



MODIFYING DELPHI 



Rationale 



There are two particular points of focus in modifying Delphi. 
They are related. For one thing, in the reporting of a Delphi forecast 
the propositions of its authors are intended to be persuasive, although 
they carry no particular means for making them persuasive. Secondly, 
convergence of estimates from round to round is often confused with 
reasoned agreement. Let us develop this last point a little further. 

When feedback consists merely of a distribution of dates or 
ranks, convergence which occurs carries no rational justification. 
This is not to say there are no reasoned changes connected with 
convergence; we are merely pointing out that, whatever the reasons, they are 
generally unknown. Nor is this to say that respondents, in finding 
others disagree, might not rethink their own position and alter it to 
more closely approximate the norm. But to make that claim is not 
the same as saying the outcome of the Delphi forecast was influenced 
by the singular or combined weighing of the arguments put forth by 
members of the Delphi panel. Those arguments simply remain hidden. 

Thus, our purpose in modifying Delphi is to force attention away 
from its propositions, whatever they are, and to focus on the elaboration 
of underlying assumptions and explanations. Whatever the outcome of a 
Delphi exercise, we want to be able to conclude it was influenced by the 
combined weight of the arguments presented. In this sense, we are talking 
more of a pedagogy than forecasting; therefore, we are not so much inter- 
ested in the claims people make about the future as how they support those 
claims and what they learn from each other in the process. 

In this view, what are the basic aspects of Delphi that need changing? 
First, there must be a redirecting of activity in Delphi studies away from 
mere description of specific events to explaining why such descriptions 
are reasonable, and why they, rather than others, are the most significant 
considerations to think about. 

Second, there must be a process whereby authors of forecasts not only 
explain why their forecasts are to be taken as reasonable but also why 
they might be expected to occur sooner rather than later. Third, by using 
those kinds of explanations as feedback, rather than using a statistical 
averaging of dates or ranks, one would reveal for the total group what 
each member states he expects and how that expectation is justified. 

These changes are crucial if Delphi is to survive the first blush of 
enthusiasm its proponents have generated. However, such changes in Delphi 
do not assure forecasts will be more accurate, nor that goals will be 
right. They simply provide that the reasons men give for such assurances 
are stated. 

While the concept of Delphi forecasting needs to be changed, it also 
needs to be expanded. It is not clear now how the outcome of a Delphi 
study is influenced by the panelists' considerations of how events are 
interrelated. Nor is it clear how the forecast is shaped by consideration 
of intervening events and long-term trends not specifically mentioned by 
the panel. Mechanisms for bringing out these considerations must be added 
to Delphi. However, simply to add the cross-impact matrix (CIM) in its 
present form is insufficient. Indeed, to do that compounds the problem. 
Why? 



The Delphi method alone raises enough questions of a sufficiently 
high order of inference to demand some considerable explanation. If one 
goes beyond that level of inference and asks not only what is the probability 
of X occurring at a given date, but asks whether as a result Y will 
occur, then the complexity as well as the importance of the explanation 
becomes paramount. Delphi and Cross-Impact Matrix techniques raise ques- 
tions demanding complex levels of inference and explanation. Yet, they 
lack the necessary mechanisms to elicit and present explanations, argu- 
ments, or underlying assumptions that would allow reasoned evaluation of 
their results. Therefore, their results cannot be assessed as valid or 
invalid, plausible or implausible. Because of that, Delphi and CIM lack 
potency as policy instruments. They simply are not convincing in their 
present form. 

To my knowledge, these needed changes were recommended for the first 
time by this writer in spring, 1969: 

"It may be more consistent with information processing theory to 
eliminate consensus forcing procedures altogether from forecasting and 
substitute feedback consisting of (a) assumptions, (b) causal factors, 
(c) evidence, or (d) theoretical bases. After several rounds of exchanging 
bases of judgments, rather than opinions, the estimates of individuals 
could be statistically averaged. To establish consistency one would hope 
to find that several persons, using the same information, reached similar 
inferences independently. Perhaps more specifically consistency could be 
established by analyses of variance in which certain factors could be 
controlled. Consistency, based upon common assumptions, evidence, causal 
factors and theory, would appear to present a more plausible method for 
generating future expectations than forcing consensus of opinion." 
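
One possible reading of the consistency check proposed in this passage is a 
one-way analysis of variance, with panelists grouped by the set of 
assumptions they were given. The sketch below is an assumption about how 
such a check might be set up, not a procedure taken from the 1969 paper; 
the groups, the estimated years, and the function name are all 
hypothetical. 

```python
from statistics import mean


def one_way_f(groups):
    """F ratio for a one-way analysis of variance over lists of estimates."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation between group means, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual estimates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical estimated years from panelists given two sets of assumptions.
assumption_set_a = [1985, 1987, 1990]
assumption_set_b = [1995, 1998, 2001]

f_ratio = one_way_f([assumption_set_a, assumption_set_b])
print(round(f_ratio, 2))
```

A large ratio would indicate that panelists working from the same 
assumptions agree with one another more closely than they agree with 
panelists working from different ones, which is the kind of consistency 
the passage describes. 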

In this view there is no empirical justification for keeping 
opinion-makers anonymous (see earlier discussion), just as there is no 
empirical justification for using dates or probabilities as feedback. 
Furthermore, there is no logical or empirical justification for seeking 
convergence under the conditions imposed by Delphi studies. 

In modifying Delphi, we should shift entirely away from the idea 
that convergence improves the accuracy of a forecast. What we need is 
an instrument to aid the process of clarifying our own assumptions and 
arguments about the future, as well as those of others. Therefore, there 
is no reason to use an anonymous questionnaire technique, except perhaps 
as an evaluative tool to show how changes in estimates reflect the effects 
of the various arguments presented. Our recent uses of Delphi have been 
to provide a hypothetical situation to an audience for the purpose of 
discussing the assumptions that accompany certain claims about the future. 

Based on these considerations, a Delphi exercise was developed and 
conducted at the International Adult Education Seminar, Syracuse University, 
in December 1969. Participants from the several countries represented 
at the Conference were first asked to judge when they thought certain 
events might occur. The events were hypothetical statements constructed 
from research in progress at the Educational Policy Research 
Center. Initial judgments of when the events were expected to occur were 
recorded but not revealed to the group. Small groups were formed and 
each was asked to discuss, as a working team, several alternative factors 
that might inhibit or lessen the chances of an event occurring. They were 
then asked to generate several alternative factors that might increase the 
likelihood an event would occur. People in each of the groups then 
discussed the arguments with the entire audience. Finally, a second round 
was conducted estimating the expected dates of occurrence for the events 
discussed. Shifts in dates were evident but no statistical treatment was 
made. During the course of teaching a graduate seminar, Summer, 1970, 
additional refinements were made in Delphi. 

Below is a listing of specific modifications recommended for the 
Delphi technique by this writer, based on the various experiences dis- 
cussed above. 

1. Familiarity 

Participants should judge their familiarity with the topics under
consideration. (Ament and Gordon used a familiarity scale, at
the suggestion of this writer, in a Delphi conducted at IFF,
sponsored jointly by EPRC, the State of Connecticut, and others.)

Our research shows an effect on the outcome of Delphi studies
from familiarity of participants (see Weaver, 1969).

2. C-W Factor 

The assignment of probability factors to the occurrence of 
events has been dropped altogether for now. Instead, estimates 
consist simply of an earliest and latest judgment as to when a 
condition might reach some recognizable proportion. This seems
to be the best solution to the confusion between personal
confidence and objective probability, as well as a recognition
that when people make judgments, they establish in their minds
some set of parameters, or what the psychologists call category
width (C-W) (see Weaver, 1969).

3. Pedagogy

Delphi is probably best used as a "conferencing" device where
discussion of long-range options might tend to lack focus and
organization; therefore, there is no need for the participants
to remain anonymous; instead, they should be able to confront
each other over issues and assumptions.



4. Focus on Explanation

Participants should be asked to consider and explain why each 
of several hypothetical conditions might have importance were they 
to occur; attention is directed but not limited to considerations 
of magnitude or proportion of the event, and its potential impact 
on other events. 

5. Focus on Underlying Assumptions and Factors

Participants ought to consider at the minimum two sets of factors
which might influence the actual occurrence of the events: a set
of negative factors and a set of enhancing factors.

6. Desirability

Participants should weigh the desirability of the events in ques- 
tion, and explain whether their views are connected to critical 
human issues and values, personal considerations, etc. 

7. Feedback

Feedback should consist of the assumptions and arguments generated 
in (3), (4), (5), and (6) above; the format is open discussion 
within small groups followed by discussion among groups. (It 
should be noted that Berghofer used feedback of this sort in his 
study after consulting an earlier EPRC report.) 

8. Convergence

Convergence or divergence which occurs after feedback ought to
be taken as an indicator of the force of arguments and the
clarifying of points of view; in this view it is assumed to have
nothing to do with the accuracy of the estimates for the events
in question.
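
As an illustration only, the earliest/latest estimates and the reading of convergence described in the list above might be tabulated along the following lines. This is a minimal sketch and not a procedure used in the exercises reported here, where no statistical treatment was made. The names `Estimate` and `describe_shift`, the use of the midpoint of each range, the standard deviation as the measure of spread, and all of the sample years are assumptions introduced for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Estimate:
    """One participant's judgment for one event: an earliest and a latest year."""
    earliest: int
    latest: int

    @property
    def midpoint(self) -> float:
        # Assumed summary of the range; the text prescribes no single point value.
        return (self.earliest + self.latest) / 2

    @property
    def width(self) -> int:
        # Breadth of the range the participant was willing to consider.
        return self.latest - self.earliest


def describe_shift(round1, round2):
    """Summarize how a group's estimates moved between two rounds.

    The result describes the effect of the discussion on the group.
    It makes no claim about the accuracy of either round.
    """
    if len(round1) != len(round2):
        raise ValueError("both rounds need one estimate per participant")
    mids1 = [e.midpoint for e in round1]
    mids2 = [e.midpoint for e in round2]
    spread1, spread2 = pstdev(mids1), pstdev(mids2)
    if spread2 < spread1:
        direction = "convergence"
    elif spread2 > spread1:
        direction = "divergence"
    else:
        direction = "no change"
    return {
        "mean_shift_years": mean(b - a for a, b in zip(mids1, mids2)),
        "spread_before": spread1,
        "spread_after": spread2,
        "direction": direction,
    }


# Invented sample data: three participants, one event, two rounds.
before = [Estimate(1980, 1990), Estimate(1975, 1985), Estimate(1995, 2005)]
after = [Estimate(1982, 1990), Estimate(1980, 1988), Estimate(1990, 2000)]
summary = describe_shift(before, after)
print(summary["direction"], summary["mean_shift_years"])
```

A summary of this kind would serve only the evaluative purpose mentioned earlier: showing how estimates changed after the arguments were heard. The arguments themselves remain the substance of the feedback.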



The modifications specified above have been incorporated in seminars
conducted at the National Educational Technology Conference, New York
City, March 1971; Futures Training Project for the State Department of
Education, State of Vermont, Spring 1971; and New York State System
Redesign Project, Cassadaga Valley Schools, Spring 1971.

Finally, it should be clear to the reader that when we speak of 
Delphi as a pedagogical tool, we do not mean Delphi without substantial 
changes. In its essentially pure form, Delphi has the same weaknesses 
as a teaching tool that it suffers as a forecasting tool, namely a lack 
of explanatory power. 
NOTES






1. For a detailed description of the original use of Delphi in
technological forecasting see O. Helmer, Social Technology. New York:
Basic Books, 1966. For a discussion of the original assumptions underlying
the epistemology of Delphi see O. Helmer and N. Rescher. "On the
Epistemology of the Inexact Sciences." Management Science, 1959, 6,
25-52.

2. See for instance T. J. Gordon and R. H. Ament, Forecasts of Some Tech - 
Middletown, Conn.: Institute for the Future, R-6, September, 1969.
4. While goals may be perceived by some to be important only because
they are personally relevant, the same goals may be held by others to
be important on moral grounds. It seems crucial to know which. For
instance, A may not rank a goal, say, achieving racial balance in the
schools, as important, because it does not affect him personally.
B may rank the goal high, not because it affects him personally, but
because it is important on some moral consideration. C, on the other
hand, may feel the goal is relatively unimportant, neither on moral
grounds nor because of personal relevancy; instead he may feel it
simply does not affect a large enough proportion of society.

5. For a discussion of these tools, see S. Sandow. "The Pedagogical
Structure of Methods for Thinking about the Future: The Citizen's
Function in Planning." Syracuse, New York: Educational Policy Research
Center, Working Draft, August 1970.
6. This distinction is somewhat different from that made in R. U. Ayres,
Technological Forecasting and Long-Range Planning. New York:
McGraw-Hill, 1969; E. Jantsch, Technological Forecasting in Perspective.
Paris: OECD, 1966; and J. R. Bright (Ed.), Technological Forecasting
for Industry and Government. Englewood Cliffs, New Jersey:
Prentice-Hall, Inc., 1968. In particular, Jantsch makes the distinction
between exploratory and normative forecasting techniques, also used in
this report, and between intuitive and feedback techniques, not used
here. The weakness in Jantsch's distinctions is that a single technique,
Delphi for instance, may be classed in all four of his categories.

7. For research conducted on extrapolation and modeling techniques at the
Educational Policy Research Center, Syracuse, see J. A. Henning and
A. D. Tussing. "The U.S. Economy Through 2000: Forecasts of Major
Macroeconomic Variables," EPRC Working Draft, April 1971; A. D. Tussing
tures in the United States," EPRC, August 1970, J. A. Henning and